Learning a Hierarchical Deformable Template for Rapid Deformable Object Parsing
Abstract
In this paper, we address the tasks of detecting, segmenting, parsing, and matching deformable objects. We use a novel probabilistic object model that we call a hierarchical deformable template (HDT). The HDT represents the object by state variables defined over a hierarchy (typically with five levels). The hierarchy is built recursively by composing elementary structures to form more complex structures. A probability distribution (a parameterized exponential model) is defined over the hierarchy to quantify the variability in shape and appearance of the object at multiple scales. To perform inference, that is, to estimate the most probable states of the hierarchy for an input image, we use a bottom-up algorithm called compositional inference. This algorithm is an approximate version of dynamic programming in which approximations (e.g., pruning) are made to ensure that the algorithm is fast while maintaining high performance. We adapt the structure-perceptron algorithm to estimate the parameters of the HDT in a discriminative manner (simultaneously estimating the appearance and shape parameters). More precisely, we specify an exponential distribution for the HDT using a dictionary of potentials, which capture the appearance and shape cues. This dictionary can be large, so the potentials need not be handcrafted. Instead, structure-perceptron assigns weights to the potentials so that less important potentials receive small weights (this is like a “soft” form of feature selection). Finally, we provide experimental evaluation of HDTs on different visual tasks, including detection, segmentation, matching (alignment), and parsing. We show that HDTs achieve state-of-the-art performance for these different tasks when evaluated on data sets with ground truth (and when compared to alternative algorithms, which are typically specialized to each task).
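The learning step described in the abstract can be illustrated with a minimal structured-perceptron sketch. This is not the authors' implementation: it assumes a toy chain model in place of the HDT hierarchy, uses exact Viterbi dynamic programming in place of the pruned compositional inference, and all names (`phi`, `decode`, `train`, `DATA`) and the toy data are hypothetical. Emission counts stand in for appearance potentials and transition counts for shape potentials. A single weight vector scores both, so they are learned simultaneously.

```python
import numpy as np

N_LABELS, N_OBS = 2, 3  # toy sizes, chosen only for illustration

def phi(x, y):
    """Joint feature vector: emission counts followed by transition counts."""
    emit = np.zeros((N_LABELS, N_OBS))
    trans = np.zeros((N_LABELS, N_LABELS))
    for t, (obs, lab) in enumerate(zip(x, y)):
        emit[lab, obs] += 1
        if t > 0:
            trans[y[t - 1], lab] += 1
    return np.concatenate([emit.ravel(), trans.ravel()])

def decode(x, w):
    """Exact dynamic programming (Viterbi) for argmax over y of w . phi(x, y)."""
    emit = w[:N_LABELS * N_OBS].reshape(N_LABELS, N_OBS)
    trans = w[N_LABELS * N_OBS:].reshape(N_LABELS, N_LABELS)
    score = emit[:, x[0]].copy()
    back = []
    for obs in x[1:]:
        cand = score[:, None] + trans  # cand[i, j]: best path ending in i, then moving to j
        back.append(cand.argmax(axis=0))
        score = cand.max(axis=0) + emit[:, obs]
    y = [int(score.argmax())]
    for bp in reversed(back):  # follow back-pointers from the last position
        y.append(int(bp[y[-1]]))
    return y[::-1]

def train(data, max_epochs=200):
    """Structured perceptron: on each mistake, move w toward the true features."""
    w = np.zeros(N_LABELS * N_OBS + N_LABELS * N_LABELS)
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in data:
            y_hat = decode(x, w)
            if y_hat != y:
                w += phi(x, y) - phi(x, y_hat)
                mistakes += 1
        if mistakes == 0:  # every training example is decoded correctly
            break
    return w

# Hypothetical toy data: observation 0 suggests label 0, observation 2 suggests
# label 1, and the ambiguous observation 1 inherits the previous label.
DATA = [
    ([0, 1, 1], [0, 0, 0]),
    ([2, 1, 1], [1, 1, 1]),
    ([0, 2, 1], [0, 1, 1]),
    ([2, 0, 1], [1, 0, 0]),
]
w = train(DATA)
```

After training, `decode(x, w)` reproduces the label sequence of each training pair, and features that never help to separate correct from incorrect outputs keep weights near zero, which loosely mirrors the "soft" feature selection mentioned above.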
Similar resources
Learning a Hierarchical Log-Linear Model for Rapid Deformable Object Parsing
In this paper, we address the problems of detecting, segmenting, parsing, and matching deformable objects. We propose a novel hierarchical log-linear model (HLLM) which represents both shape and appearance features at multiple levels of a hierarchy. This enables us to combine appearance cues at multiple scales and to model shape deformations at a range of scales. We provide a bottom-up algorith...
Max-Margin Learning of Hierarchical Configural Deformable Templates (HCDT) for Efficient Object Parsing and Pose Estimation
In this paper we formulate a hierarchical configurable deformable template (HCDT) to model articulated visual objects, such as horses and baseball players, for tasks such as parsing, segmentation, and pose estimation. HCDTs represent an object by an AND/OR graph where the OR nodes act as switches which enable the graph topology to vary adaptively. This hierarchical representation is compositi...
Robust Tracking of Stochastic Deformable Models in Long Image Sequences
In this paper, we describe a method for the temporal tracking of stochastic deformable models in long image sequences. The object representation relies on a hierarchical statistical description of the deformations applied to a template. A Bayesian estimate of the deformations is obtained by maximizing a highly non-linear joint probability distribution. Time-consuming global (stochastic) optimiz...
Deformable Object Tracking Using the Boundary Element Method
This paper presents a method to perform 2D deformable object tracking using the boundary element method (BEM). BEM, like the finite element method (FEM), is a technique to model an elastic solid. BEM differs from FEM in that only the contour of an object needs to be meshed for BEM, making this method attractive for computer vision problems. For FEM, the interior of the object must be meshed als...
Deformable 3D Reconstruction with an Object Database
Deformable 3D reconstruction from 2D images requires prior knowledge on the scene structure. Template-free methods [1, 2, 5, 6, 9, 14] use generic prior knowledge such as piecewise smoothness but require multiple images with significant baseline. Template-based methods [4, 10, 13] require only one image but handle only one object for which they need specific prior knowledge, namely a 3D templat...
Publication date: 2010